Frontiers in Applied Mathematics and Statistics — Latest Matching Preprints

1

Identification of a Fractional Model for an Outbreak of the Dengue Fever

Cresson, J.; Pere, M.; Szafranska, A.

2026-05-27 epidemiology 10.64898/2026.05.26.26354120 medRxiv

Top 0.3%

1.2%

This work focuses on the global and partial identification problem for fractional differential equations. We provide a general numerical procedure based on global and local optimization algorithms with two refinements for biological systems that ensure solution positivity and homogeneous parameter units. The method is applied to a new fractional model of Dengue outbreak called the Fractional Homogeneous Nishiura (FHN) model, calibrated using data of newly infected people in Cape Verde. We show that our identification method yields a better fit between data and model solutions than previous approaches and that our FHN model captures the dynamics of Dengue more closely than existing systems.

2

A Supervised Learning Framework for Stroke Hospitalization Factors Selection Using the Lasso-MIDAS Model

Li, Q.; Wang, L.

2026-05-20 cardiovascular medicine 10.64898/2026.05.15.26353365 medRxiv

Top 0.3%

0.9%

Stroke, as an acute cerebrovascular disease with significant public health implications, is influenced by a complex interplay of meteorological conditions, air quality, and socioeconomic factors. However, the inherent challenges of mixed-frequency data from diverse sources and high-dimensional variable spaces limit the effectiveness of traditional regression models. This study develops a Lasso-MIDAS model framework to identify the key multidimensional drivers of stroke admissions. Using this approach, 21 candidate variables encompassing meteorological, environmental, and economic indicators were screened. The empirical results identified 11 core influencing factors. In the meteorological and environmental dimensions, Wind Speed, Carbon Monoxide (CO), and Sulfur Dioxide (SO2) were identified as significant positive drivers, with Temperature Difference also positively correlating with admission risks. Conversely, Nitrogen Dioxide (NO2) exhibited a negative correlation, potentially reflecting behavioral adaptation and exposure reduction during peak pollution periods. In the socioeconomic dimension, the Consumer Price Index (CPI) for Food, Tobacco, and Alcohol emerged as a major risk factor, highlighting the impact of living cost pressures on public health. The findings demonstrate the superiority of the Lasso-MIDAS model in handling large-scale healthcare data. It effectively addresses the frequency mismatch problem while enhancing the robustness of causal identification through variable shrinkage. These conclusions provide a scientific basis for health authorities to establish early warning systems and optimize public health policy interventions.

3

Enhancing dengue diagnosis and surveillance by integrating machine learning technologies with the NS1 rapid test kit

Hwang, C.-K.; Chen, Y.-W.; Wang, Y.-T.; Ho, T.-S.; Oyang, Y.-J.

2026-05-06 health informatics 10.64898/2026.05.05.26352445 medRxiv

Top 0.5%

0.7%

Background: Dengue has been a major global health threat in recent years. Dengue incidence continues to increase annually, and the epidemic area has expanded, primarily due to global warming. Effective case detection and surveillance strategies are therefore crucial to tackling this global health challenge. In clinical practice, the rapid test kit that detects dengue non-structural protein 1 antigen, commonly referred to as NS1, is widely employed for early diagnosis. However, real-world studies have revealed that the sensitivity of the NS1 test kit ranges from approximately 61% to 95%. Since early diagnosis is critical for disease surveillance in the early stage of a dengue epidemic, scientists have been working to develop novel diagnosis methods that provide higher sensitivity. Methodology/Principal Findings: In response to this challenge, we developed a novel diagnosis procedure that integrates machine learning technologies with the NS1 test kit. Our experimental results show that the sensitivity of the dengue diagnosis procedure can be raised above 99% by using machine-learning-based prediction models to screen suspected patients with a negative NS1 result. Furthermore, the relative risk between suspected patients who were predicted to be positive and those who were predicted to be negative exceeded 4.8. Conclusions/Significance: These results illustrate that the proposed approach provides an effective and efficient diagnosis procedure to address the global health challenge caused by the spread of dengue.
Author Summary: This study aimed to enhance dengue surveillance by integrating machine learning technologies with the rapid test kit commonly employed in early diagnosis. In clinical practice, the NS1 rapid test kit is widely employed for early diagnosis. However, real-world studies revealed that a certain percentage of the patients with a negative NS1 test result, ranging from 5% to 39%, were actually infected by dengue. Since early diagnosis is critical for disease control in the early stage of a dengue epidemic, scientists have been working hard to tackle this challenge. Based on this observation, this study was launched to investigate the effects of incorporating machine-learning-based prediction models to further screen those patients with a negative NS1 test result. The experimental results revealed that the proposed approach was able to identify over 99% of the patients who were infected by dengue. Furthermore, the risk of the suspected patients who were predicted to be positive was 4.8 times higher than the risk of those who were predicted to be negative. The experimental results illustrate that the proposed approach provides an effective and efficient diagnosis procedure to enhance dengue surveillance.
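The sensitivity figures in this abstract can be related by simple arithmetic. The sketch below is not the authors' method. It assumes a two-stage procedure in which the model is applied only to NS1-negative patients and a patient counts as detected if either stage flags them. The function names are hypothetical.

```python
def combined_sensitivity(kit_sensitivity: float, ml_recall_on_negatives: float) -> float:
    """Overall sensitivity when a model re-screens infected patients missed by the kit.

    Assumption: an infected patient is detected if the kit is positive, or if the
    kit is negative and the model flags them.
    """
    missed_by_kit = 1.0 - kit_sensitivity
    return kit_sensitivity + missed_by_kit * ml_recall_on_negatives


def required_ml_recall(kit_sensitivity: float, target_sensitivity: float) -> float:
    """Recall the model needs on kit-negative infected patients to hit the target."""
    if target_sensitivity <= kit_sensitivity:
        return 0.0  # the kit alone already meets the target
    return (target_sensitivity - kit_sensitivity) / (1.0 - kit_sensitivity)


# Kit sensitivities taken from the range quoted in the abstract (61% to 95%).
recall_at_low_end = required_ml_recall(0.61, 0.99)
recall_at_high_end = required_ml_recall(0.95, 0.99)
```

Under these assumptions, the lower the kit sensitivity, the higher the recall the second-stage model must reach for the combined procedure to exceed 99%.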

4

An Interpretable Multimodal Framework for Student Mental Health Risk Assessment Using Temporal Embeddings and Fuzzy Inference

Shah, A.; Mehta, A.; Bhensdadia, C. K.

2026-05-20 health informatics 10.64898/2026.05.16.26352630 medRxiv

Top 0.5%

0.6%

Mental health challenges among university students have increased due to academic pressure, lifestyle changes, and continuous digital engagement. Existing approaches for mental health assessment often rely either on self-reported psychological scales or isolated behavioral indicators, limiting their ability to capture complex temporal and contextual patterns. This study proposes an interpretable multimodal framework for student mental health risk assessment using behavioral sensing, academic information, ecological momentary assessments (EMA), and psychometric survey data. A bidirectional Long Short-Term Memory autoencoder is employed to learn latent temporal representations from day-level behavioral sequences, while graph embeddings capture structural relationships among students using similarity-based neighborhood graphs. These representations are fused with academic and survey-derived features and reduced using Principal Component Analysis and Uniform Manifold Approximation and Projection. K-means clustering is then applied to identify behaviorally distinct student groups. Experimental analysis on the StudentLife dataset demonstrates meaningful clustering performance with a Silhouette Score of 0.4209 and Adjusted Rand Index stability of 0.6869. The identified clusters correspond to low-risk, moderate-risk, and high-risk behavioral profiles. To improve interpretability and practical usability, a fuzzy inference system is introduced to compute mental risk, academic risk, and wellbeing indices using psychometric indicators including PHQ-9, PSS, PANAS, VR-12, and Big Five personality traits. The results demonstrate the potential of combining multimodal behavioral modeling with interpretable fuzzy reasoning to support early mental health risk assessment in educational settings.

5

Non-Newtonian Blood Rheology Significantly Alters Hemodynamic Predictions During Cardiac Looping: A Computational Study

Watson, M. C.; Kemmerling, E. C.; Black, L. D.

2026-05-19 developmental biology 10.64898/2026.05.15.725470 medRxiv

Top 0.6%

0.5%

Hemodynamic forces play a key role in early cardiac morphogenesis, yet many computational studies assume Newtonian blood behavior. Here, we evaluate the impact of non-Newtonian shear-thinning rheology on flow patterns, pressure distributions, and wall shear stress (WSS) during cardiac looping using idealized three-dimensional models of the embryonic heart tube. Five geometries representing progressive looping stages, from a linear tube to an S-shaped configuration with ventricular ballooning, were analyzed under pulsatile flow using both Newtonian and power-law viscosity models. Across all stages, Reynolds numbers (Re ≈ 1-7) and Womersley numbers (Wo ≈ 0.3) indicated laminar, quasi-steady flow consistent with embryonic conditions. Incorporating shear-thinning rheology produced substantial deviations from Newtonian predictions, with peak systolic WSS differing by up to ~40% and pressure drops by up to ~20%. These effects were most pronounced in regions of increased curvature and geometric complexity. These findings demonstrate that non-Newtonian rheology significantly influences predicted hemodynamic environments during cardiac looping and should be incorporated into computational models aimed at understanding mechanobiological regulation of early heart development.

6

Modeling the Impact of Exposed Cases in a Hantavirus Outbreak on a Cruise Ship

Cui, J.

2026-05-12 epidemiology 10.64898/2026.05.08.26352718 medRxiv

Top 0.6%

0.5%

The emergence of a hantavirus variant aboard a commercial cruise ship presents a significant public health concern. This study develops a discrete-time stochastic Susceptible-Exposed-Infectious-Recovered-Dead model to estimate transmission dynamics, hidden exposed infections, and outbreak risk among passengers and crew. Epidemiological parameters and latent disease states were inferred using an Ensemble Adjustment Kalman Filter calibrated to reported case data from WHO and ECDC situation reports. The estimated basic reproduction number was 2.76, with a 95% confidence interval of 2.52-2.99, indicating substantial potential for sustained onboard transmission before strict quarantine measures. Simulations further suggest that several exposed individuals may remain unidentified during the early outbreak phase, creating a hidden reservoir that symptom-based surveillance alone may fail to detect. These findings highlight the importance of rapid surveillance, widespread testing, targeted quarantine, and active monitoring of exposed individuals in confined travel settings. The proposed modeling framework can support timely outbreak assessment and intervention planning for infectious-disease events in similarly dense and spatially constrained populations.
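A discrete-time stochastic SEIRD model of the kind named in this abstract can be sketched with binomial transitions between compartments. This is a generic illustration, not the author's implementation, and it does not include the Ensemble Adjustment Kalman Filter. All parameter values and the population size are hypothetical. The one exception is that `beta` is chosen so that `beta / gamma` equals the reported 2.76, which is itself only an illustrative choice.

```python
import math
import random


def binomial(rng: random.Random, n: int, p: float) -> int:
    """Draw from Binomial(n, p) by summing n Bernoulli trials (fine for small n)."""
    return sum(1 for _ in range(n) if rng.random() < p)


def seird_step(state, beta, sigma, gamma, mu, rng):
    """Advance (S, E, I, R, D) by one day using binomial transitions."""
    s, e, i, r, d = state
    alive = s + e + i + r
    # Daily infection probability per susceptible, from the force of infection.
    p_inf = 1.0 - math.exp(-beta * i / alive) if alive > 0 else 0.0
    new_exposed = binomial(rng, s, p_inf)
    new_infectious = binomial(rng, e, 1.0 - math.exp(-sigma))
    removed = binomial(rng, i, 1.0 - math.exp(-gamma))
    deaths = binomial(rng, removed, mu)  # a fraction mu of removals die
    return (
        s - new_exposed,
        e + new_exposed - new_infectious,
        i + new_infectious - removed,
        r + removed - deaths,
        d + deaths,
    )


def simulate(initial_state, days, beta, sigma, gamma, mu, seed=0):
    """Return the list of daily states, starting with the initial state."""
    rng = random.Random(seed)
    trajectory = [initial_state]
    for _ in range(days):
        trajectory.append(seird_step(trajectory[-1], beta, sigma, gamma, mu, rng))
    return trajectory


# Hypothetical ship of 1,000 people with 3 exposed and 2 infectious at the start.
trajectory = simulate(
    (995, 3, 2, 0, 0), days=60, beta=2.76 * 0.2, sigma=0.25, gamma=0.2, mu=0.1
)
```

The exposed compartment in each state is the "hidden reservoir" the abstract refers to: individuals who are infected but not yet detectable through symptoms.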

7

Positive Registration Rate as a Key Determinant of COCOA Effectiveness: Empirical Evidence from Individual-Level Key-Match Data during the Sixth and Seventh COVID-19 Waves in Japan

Nakagawa, S.; Kumagai, S.; Yamamoto, A.

2026-05-08 health informatics 10.64898/2026.05.06.26352506 medRxiv

Top 0.6%

0.5%

Background: COCOA, Japan's Bluetooth-based COVID-19 contact tracing app, was widely regarded as ineffective due to persistently low key-match counts. However, this assessment may have conflated two distinct phenomena: (1) a structurally suppressed positive registration rate caused by administrative friction in the HER-SYS linkage, and (2) genuine epidemiological inefficacy. Objective: To empirically examine whether the correlation between individual COCOA key-match counts and regional COVID-19 case numbers depended on positive registration rate, using a unique longitudinal dataset from a single observer with a rigorously controlled behavioral pattern. Methods: The corresponding author (S.N.) recorded daily key-match counts from his personal iPhone from January 10 to October 8, 2022, encompassing the Sixth Wave (January 10-April 20, 2022) and Seventh Wave (July 9-September 2, 2022). Daily reported COVID-19 cases in Tokyo were obtained from publicly available NHK data. Pearson correlation coefficients were calculated for each wave period separately. Results: During the Sixth Wave, no meaningful correlation was observed between key-match counts and daily case numbers (r² = 0.018, p = 0.059, n = 194). In contrast, during the Seventh Wave, a strong positive correlation emerged (r² = 0.530, p < 0.001, n = 56). This correlation disappeared abruptly after September 12, 2022, coinciding with Japan's revision of the mandatory full case reporting (Zenshu Todokedashi) policy, which substantially reduced positive registrations in COCOA. Conclusions: COCOA's utility as an individual infection risk indicator was critically dependent on positive registration rate rather than app installation rate. These findings provide the first real-world empirical evidence supporting the threshold effect predicted by prior simulation studies, and offer important lessons for the design of digital tools in future pandemic preparedness.

8

SEIR-IoT cyber-physical architecture with dual parametric coupling for epidemic scenario simulation using synthetic biomedical signals

Martinez Campo, S. D.; Campo-Ariza, F. M.; Martinez Campo, J. A.; Cormane, M.

2026-05-10 epidemiology 10.64898/2026.05.06.26352603 medRxiv

Top 0.8%

0.2%

This study presents a proof-of-concept cyber-physical architecture integrating a SEIR epidemiological model (Susceptible-Exposed-Infectious-Recovered), implemented in MATLAB, with a simulated Internet of Things (IoT) acquisition and transmission stage based on the ESP32 microcontroller and the ThingSpeak platform. The system generates synthetic biomedical signals of body temperature and peripheral oxygen saturation (SpO2), structured across three levels: circadian variation, scheduled pathological episodes, and Gaussian noise. These signals feed a dual parametric coupling function that dynamically updates the SEIR transmission parameter as a combined function of body temperature and oxygen saturation deviations from their clinical reference values. The proposed architecture is organized into four functional phases: measurement, communication, computational processing, and feedback. Five simulated clinical scenarios were evaluated, ranging from normal conditions (T = 36.5 °C, SpO2 = 97%) to fever with severe hypoxia (T = 38.5 °C, SpO2 = 88%), yielding basic reproduction number (R0) values between 4.20 and 5.38, and peak infected proportions between 29.9% and 35.2% of the simulated population (N = 1,000). A sensitivity analysis on the coupling coefficients, with ±50% variation from nominal values, showed that the oxygen saturation coefficient is the most influential parameter on R0 (range = 0.76) compared to the thermal coefficient (range = 0.42), with monotonic and predictable behavior across the entire evaluated parametric space. The primary contribution of this work is system integration: we propose a reproducible platform connecting biomedical simulation, IoT communication, and epidemiological modeling through parametric coupling in a controlled environment. All data used are entirely synthetic; a retrospective calibration with real Colombian data from the first epidemic wave of 2020 confirmed the epidemiological consistency of the model, with a calibrated R0 of 1.85 and a Pearson correlation of 0.930. Results should be interpreted as evidence of architectural feasibility, not as clinical or epidemiological validation.
Author Summary: The COVID-19 pandemic made it clear that epidemiological surveillance systems need tools that combine accessible technology with mathematical models capable of anticipating disease spread. In this work, we built a proof-of-concept platform connecting three elements: a low-cost electronic sensor based on the ESP32 microcontroller, a cloud communication platform (ThingSpeak), and a mathematical model that simulates how an epidemic spreads through a population. The sensor generates synthetic data on body temperature and oxygen saturation that, through a mathematical formula we designed, dynamically modify the rate of contagion in the model. We evaluated five clinical scenarios, ranging from normal conditions to fever with severe hypoxia, and analyzed how sensitive the results are to changes in the system parameters. We found that oxygen saturation has a greater influence on the estimated contagion potential than body temperature. Although all data are synthetic, this platform demonstrates that it is possible to integrate low-cost sensors with epidemiological models in real time, opening a viable pathway for early warning systems in resource-limited settings.

9

How to Monitor Physical activity in pregnant women? Questionnaire and accelerometer: stages of building a virtual assistant

Perdona, G. C.; da Costa, T. C.; da Silva, C. M.; de Fazio, R. B.; Zanutto, N. T.; Lopes, C. E. C. E.; Facci, L. B.

2026-05-18 health informatics 10.64898/2026.05.07.26343713 medRxiv

Top 0.8%

0.2%

Introduction: Physical activity during pregnancy can be tracked directly by accelerometer measurements and indirectly by validated questionnaires. With the advancement of the Internet of Things (IoT), the management and monitoring of physical activity can be explored further to analyze individuals and to compare, indirectly, the intensity and domains of physical activities carried out by pregnant women. The project, called 'EVA' (Expert Virtual Assistant), combines several fields of knowledge to obtain better information about physical activity during pregnancy, addressing the challenge, noted in previous research, of studying and measuring the duration of daily physical activities in pregnant women. Objective: In the present study, we present the results of the first stage of the EVA project, which aims to develop a Virtual Assistant (VA) in Portuguese, providing examples of health management features for monitoring physical activity measurements for pregnant women assisted in the Unified Health System (SUS), and the adaptation of the Pregnancy Physical Activity Questionnaire (PPAQ). Methods and Analysis: The methods were developed in two stages: adapting the physical activity questionnaire and building the Virtual Assistant to monitor physical activities. Thirty pregnant women who used the Unified Health System (SUS) in the city of Ribeirão Preto, Brazil participated in the study. The pregnant women wore sensor wristbands (accelerometers) and answered the sociodemographic, lifestyle and physical activity questionnaires via an application developed for this study. Results: The questionnaire used was the PPAQ adapted for Brazilian pregnant women. The most important changes were in the occupational domain for house cleaning and in sedentary behavior activities. In the pilot study, it was observed that pregnant women spend more energy at home and in light and moderate intensity activities. Conclusion: This study made important contributions to evaluating PA in pregnant women: the proposal and studies for the construction of the AV-EVA, the inclusion of a specific occupational domain for pregnant women with domestic occupations, and the new cutoff points for PA intensity measurements obtained via accelerometers.

10

A statistical analysis of pulse transit time captured using pressure sensors at the human radial artery of the wrist

Rao M, S.; Khezrimotlagh, D.

2026-05-20 health informatics 10.64898/2026.05.14.26353264 medRxiv

Top 0.9%

0.2%

Non-invasive wrist pulse monitoring has been integrated into various medical systems for cardiovascular assessment. However, different definitions of pulse transit time are used in the literature, and their statistical behavior when measured locally at the wrist using pressure sensors has not been systematically examined. Wearable wristbands designed to measure pulse transit time (PTT) have emerged as valuable tools for evaluating cardiac activity. While several algorithms have been developed to predict blood pressure using PTT, it is well recognized that PTT and its inverse parameter, pulse wave velocity (PWV), exhibit temporal variability. In this study, PTT was explicitly measured at the wrist's radial artery to investigate its statistical variation and relationship with different arterial pressures. The experiment uses two distinct methodologies for PTT computation, based on onset and peak measurements. Data were recorded across five cuff pressure levels at 20, 40, 60, 80, and 100 mmHg using the pulse pressure sensor (PPS). PTTonset shows a lower coefficient of variation than PTTpeak within the 100 mmHg pressure range. Only a weak correlation was recorded between the two PTT values. However, dynamic time warping (DTW) analysis revealed a notable similarity in the time series of PTTonset and PTTpeak, regardless of the applied pressure level. For the multi-participant dataset, the mean DTW distances ranged from 0.029 to 0.046 across the tested cuff pressures, illustrating consistent similarity between PTTonset and PTTpeak over time. The objective of this study is to examine the statistical behavior, stability, and temporal similarity of the two commonly used PTT definitions when measured at the radial artery using pressure sensors. Statistical analysis shows consistent differences between the two PTT definitions across participants. PTTonset shows lower variation than PTTpeak. However, PTTpeak requires simpler computation and produces fewer detection errors, while PTTonset provides lower statistical variation.
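Dynamic time warping, the similarity measure used in this abstract, can be sketched in a few lines. This is the textbook dynamic-programming formulation with an absolute-difference cost. The abstract does not state which normalization, step pattern, or window constraint the authors used, so the distances produced here are not directly comparable to the reported 0.029 to 0.046 range.

```python
from typing import Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Unnormalized DTW distance between two series using |x - y| as local cost."""
    if not a or not b:
        raise ValueError("both series must be non-empty")
    inf = float("inf")
    # prev[j] holds the best cumulative cost of aligning the series seen so far
    # in `a` with the first j samples of `b`.
    prev = [inf] * (len(b) + 1)
    prev[0] = 0.0
    for x in a:
        curr = [inf] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            cost = abs(x - y)
            # Extend the cheapest of: insertion, deletion, or match.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[len(b)]


# Hypothetical PTT series in seconds, for illustration only.
ptt_onset = [0.110, 0.112, 0.111, 0.115]
ptt_peak = [0.111, 0.113, 0.113, 0.112, 0.116]
distance = dtw_distance(ptt_onset, ptt_peak)
```

Because DTW allows samples to be stretched in time, two series can have a small DTW distance even when their point-by-point correlation is weak, which is consistent with the pattern the abstract describes.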

11

Understanding Disordered Eating Attitudes and Patterns in University Students and the Relationship to Campus Dining Services

Bartling, B. A.

2026-05-15 health informatics 10.64898/2026.05.11.26352946 medRxiv

Top 1%

0.1%

University students are particularly vulnerable to disordered eating behaviors (DEB) and attitudes (DEA). This study expands the knowledge base on DEA and DEB in university students by employing a netnography as a precursor to the main study to establish the following research questions: What is the relationship between the perceived quality of dining services and DEA? What is the relationship between the perceived availability of dining services and DEA? And lastly, how does prior experience with dining services affect eating patterns and attitudes toward food? The first study used a netnographic approach to evaluate issues with university dining services, leading to the design of the second study. Students at an upper Midwestern university (n=88) were surveyed via convenience sampling. Eating attitudes, eating behaviors, and relationships with dining services were measured. A statistically significant relationship was found between the availability of services and DEA, and between the availability of services and risk behaviors. However, no statistically significant correlation existed between first-year dependence on on-campus dining services and risk behavior related to eating disorders or eating attitudes. These results suggest that the quality of nutrition and the availability of services, not inherent dependence, affected students' eating attitudes and behaviors.

12

The Verification Gap: Artificial Intelligence Adoption, Hallucination Awareness, and Verification Practices Among Early Career Medical Researchers in Pakistan

Sajjad, M.

2026-05-30 health informatics 10.64898/2026.05.28.26354373 medRxiv

Top 1%

0.1%

Artificial intelligence (AI) tools have been rapidly adopted by medical researchers, yet whether early-career researchers in low- and middle-income countries possess the awareness and habits needed to use these tools safely remains poorly documented. This study characterized AI adoption patterns, hallucination awareness, and verification and disclosure practices among early-career medical researchers in Pakistan. A cross-sectional anonymous online survey was conducted among medical students, house officers, residents, physicians, and faculty involved in research or academic work across Pakistan (May 2026). Descriptive statistics and chi-square tests were applied to 373 eligible responses. AI use was near universal (99.7%), with 60.3% using AI tools daily. The most commonly reported tool in this sample was Claude (40.5%), followed by ChatGPT (29.2%) and Perplexity (26.0%), though this ranking likely reflects sampling characteristics. Despite high adoption, 59.2% typically did not verify AI outputs before use, and 40.2% had never heard that AI can generate fabricated scientific references. In behavioral vignettes, 36.5% assumed convincing AI-generated references were authentic, and 54.2% would continue using remaining AI content after discovering one fabricated reference. Formal research training was strongly associated with consistent disclosure (51.7% vs. 17.1%; chi-square = 48.43, p < 0.001). Role, daily use frequency, and research training were not significantly associated with verification behavior. Early-career medical researchers in Pakistan demonstrate high AI adoption alongside incomplete hallucination awareness and infrequent verification, a pattern that may carry implications for research integrity. Formal training was the only factor significantly associated with consistent disclosure. Integration of AI literacy into medical curricula and institutional governance frameworks merits consideration.

13

A Consensus-Driven Stacking Ensemble Framework for Interpretable Cardiovascular Risk Prediction and Clinical Deployment

Sozol, S. S.; Dev Nath, B. C.; Fahim, F. M. S.; Suzana, N. N.; Mirza, J. F.; Ahmmed, S.; Zohra, F.-T.; Zafr, A. H. A.; Uddin, M. N.; Mondal, M. R. H.; Hoque, A. S. M. L.

2026-05-26 health informatics 10.64898/2026.05.18.26352989 medRxiv

Top 1%

0.1%

Machine learning (ML) is being considered to help diagnose cardiovascular diseases (CVD). Still, challenges such as inconsistent and limited datasets, limited infrastructure, and global inequalities create the need for a reliable and practicable ML solution. This paper presents an ML-driven framework for predicting CVD risk scores and classifying status. Several data preprocessing techniques, including multiple imputation by chained equations (MICE) and outlier removal, are applied. In addition, hyperparameter tuning is performed with GridSearchCV. Moreover, a consensus-driven five-feature selection method is applied to identify optimal predictors. The dataset used in this study contains healthcare records related to future CVD risk scores, comprising 1,529 patient records with 22 features. The optimized stacked ensemble model is applied to the dataset and achieves a cross-validated coefficient of determination of 98.13% for CVD risk score regression. Comparative evaluation with other ML models confirmed improved accuracy, efficiency, and interpretability. The explainable AI technique SHAP is applied to interpret predictions and highlight key risk factors. Moreover, a deployment-ready web platform with multi-role access has been developed that demonstrates clinical applicability. The proposed framework offers a reliable and interpretable tool for early detection of CVD and personalized risk assessment. In the future, this work can be extended to integrate longitudinal data, medical imaging, and deep learning to improve generalizability and strengthen real-world impact.

14

Artificial Intelligence Driven Support and Self Care Competence as Determinants of Medication Adherence in Diabetes Care, A Cross-sectional Nigerian Study

Onah, C.; Ajonye, A. A.

2026-05-07 health informatics 10.64898/2026.05.06.26352516 medRxiv

Top 1%

0.1%

Medication adherence among patients with diabetes remains suboptimal in low- and middle-income countries, including Nigeria. Emerging digital health interventions such as AI-powered virtual support may be associated with improved adherence behaviours. This study examined self-care competence and perceived AI-powered virtual support as predictors of medication adherence among patients with diabetes. A cross-sectional survey was conducted among 450 patients recruited through multistage sampling across hospitals in Benue State, Nigeria. Standardised measures of self-care competence, perceived AI support, and medication adherence were analysed using correlation and regression analyses. Results showed that self-care competence significantly predicted medication adherence (R² = .161), although some components (glucose management, physical activity, healthcare use) showed negative associations. Perceived AI-powered support demonstrated stronger predictive power (R² = .328), with social presence (β = .311, p < .001) and social interactivity (β = .142, p < .01) emerging as key predictors. The combined model explained 36.3% of variance (R² = .363). In conclusion, perceived AI-powered virtual support, particularly socially interactive features, plays a significant role in enhancing medication adherence and may complement traditional self-care strategies. Clinicians are therefore advised to adopt a hybrid care model that integrates traditional patient education with AI-assisted interventions. This approach can help bridge gaps caused by high patient loads and limited consultation time, while also enhancing personalised care.

15

Machine Learning and Explainable AI for Multi-State Classification of Malaria Transmission Dynamics in Kenya

Gogo, J. A.; Wanyonyi, M.

2026-05-12 health informatics 10.64898/2026.05.09.26352789 medRxiv

Top 1%

0.1%

Malaria remains a major public health challenge in sub-Saharan Africa, with pronounced spatial and temporal variation in transmission intensity that complicates effective control strategies. Accurate classification of transmission states is essential for guiding targeted interventions and strengthening early warning systems. This study develops a machine learning framework for the classification of malaria transmission states in Kenya using monthly panel data from 47 counties spanning the period 2015 to 2025. Transmission was categorised into four operationally relevant states based on incidence thresholds. Four supervised learning models, namely multinomial logistic regression, random forest, extreme gradient boosting, and support vector machine, were trained using temporally lagged features and evaluated under a forward-chaining validation scheme to preserve temporal structure. Model performance was assessed using accuracy, macro-averaged F1 score, Matthews correlation coefficient, and Brier score, complemented by calibration analysis. Extreme gradient boosting achieved the best overall performance, with accuracy of 0.9918, macro-averaged F1 score of 0.9647, and Matthews correlation coefficient of 0.9831, alongside the lowest Brier score of 0.0031, indicating highly reliable probability estimates. Feature importance analysis revealed that lagged incidence, vegetation index, precipitation, and insecticide-treated net coverage were the most influential predictors. Partial dependence analysis demonstrated nonlinear relationships and clear seasonal patterns in transmission dynamics. The findings show that machine learning approaches can accurately classify malaria transmission states while providing interpretable and well-calibrated outputs for decision making. This framework offers a practical tool for supporting malaria surveillance and resource allocation. Further validation in different epidemiological settings is recommended to assess generalisability.
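The forward-chaining validation scheme mentioned in this abstract can be illustrated with an expanding-window split generator. This is a generic sketch. The abstract does not give the authors' initial training length, test horizon, or step size, so `initial_train` and `horizon` are hypothetical parameters.

```python
from typing import Iterator, List, Tuple


def forward_chaining_splits(
    n_periods: int, initial_train: int, horizon: int = 1
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) period indices where training always precedes testing.

    The training window starts with `initial_train` periods and grows by
    `horizon` periods after each fold, so no future data leaks into training.
    """
    if initial_train < 1 or horizon < 1:
        raise ValueError("initial_train and horizon must be positive")
    train_end = initial_train
    while train_end + horizon <= n_periods:
        train = list(range(train_end))
        test = list(range(train_end, train_end + horizon))
        yield train, test
        train_end += horizon


# Six monthly periods, three used for the first training window.
splits = list(forward_chaining_splits(n_periods=6, initial_train=3, horizon=1))
```

With panel data, each index would stand for one month across all counties, so that every county's future observations stay out of the training folds.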

16

A Three-Layered Agent-Based Model of Adult Hippocampal Neurogenesis (HANG-AB3L) with Stochastic Cell Fate Determination

Oz, P.; Atbasi, A.

2026-05-12 developmental biology 10.64898/2026.05.08.723711 medRxiv

Top 1%

0.1%

Hippocampal adult neurogenesis (HANG) is a highly regulated process in which neural stem cells progress through distinct stages, from Type 1 radial glia-like cells to mature neurons, via a complex series of proliferative and differentiative divisions. While recent in vivo imaging has provided valuable insights into cellular processes, the exact relationship between individual cell-fate decisions and long-term population stability remains difficult to quantify empirically. In this study, we utilized an agent-based (AB) model to simulate the stochastic dynamics of the hippocampal neurogenic niche. Our results demonstrate that while individual progenitor lineages exhibit high variability and probabilistic division symmetries (proliferative symmetric, asymmetric, and differentiative symmetric), the system achieves deterministic stability as the initial progenitor density increases. We found that the T1 progenitor pool follows a negative exponential decay profile, with its longevity primarily dictated by the differentiation rate (d,0). Critically, the terminal output of immature neurons (CIN,t) was non-linearly coupled to the proliferative capacity of transit-amplifying cells (pp,0); even marginal increases in symmetric proliferative divisions resulted in an exponential expansion of the neuronal pool. These findings suggest that the homeostatic maintenance of the hippocampal niche is governed by a kinetic tuning of division probabilities, providing a theoretical bridge between single-cell stochasticity and robust tissue-level output.
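
The two qualitative behaviours described here can be reproduced with a toy simulation. This is not the authors' HANG-AB3L model: the function names, parameters (`d`, `p_pp`, `p_pa`) and update rules below are hypothetical simplifications. Each stem cell leaves the pool independently with probability `d` per step, which gives an expected pool size of `n0 * (1 - d) ** t`, and each progenitor division is drawn as proliferative symmetric, asymmetric, or differentiative symmetric.

```python
import random


def simulate_t1_pool(n0, d, steps, seed=0):
    """Track a stem-cell pool in which each cell differentiates
    (leaves the pool) independently with probability d per step."""
    rng = random.Random(seed)
    pool = n0
    history = [pool]
    for _ in range(steps):
        leaving = sum(1 for _ in range(pool) if rng.random() < d)
        pool -= leaving
        history.append(pool)
    return history


def divide(rng, p_pp, p_pa):
    """Return (progenitor daughters, neuron daughters) for one division.

    p_pp: probability of proliferative symmetric division (2 progenitors)
    p_pa: probability of asymmetric division (1 progenitor, 1 neuron)
    Remaining probability: differentiative symmetric division (2 neurons).
    """
    u = rng.random()
    if u < p_pp:
        return 2, 0
    if u < p_pp + p_pa:
        return 1, 1
    return 0, 2


def simulate_lineage(p_pp, p_pa, generations, seed=0):
    """Follow one progenitor lineage; return (progenitors, neurons)."""
    rng = random.Random(seed)
    progenitors, neurons = 1, 0
    for _ in range(generations):
        next_progenitors = 0
        for _ in range(progenitors):
            p, n = divide(rng, p_pp, p_pa)
            next_progenitors += p
            neurons += n
        progenitors = next_progenitors
    return progenitors, neurons


history = simulate_t1_pool(n0=1000, d=0.1, steps=20)
lineage = simulate_lineage(p_pp=0.4, p_pa=0.3, generations=8)
```

In this toy setting the expected number of progenitors is multiplied by `2 * p_pp + p_pa` per generation, so small changes in `p_pp` compound geometrically, which is one way to read the non-linear coupling reported in the abstract.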

17

Canine Rabies in N'Djamena: A Metapopulation SEIR Model Incorporating Vaccination and Inter-Patch Distances

Djimramadji, H.; Koutou, O.; Dawe, S.

2026-05-12 epidemiology 10.64898/2026.05.08.26352733 medRxiv

Top 1%

0.1%

Canine rabies persists in N'Djamena (Chad) despite vaccination campaigns exceeding 70% coverage, suggesting a role for dog mobility and spatial heterogeneity. We propose a metapopulation SEIR model incorporating distance-modulated dog movements and an explicit vaccinated class. Analysis of the isolated patch establishes global stability of the disease-free equilibrium via a Lyapunov function. For the metapopulation, a composite Lyapunov function shows that elimination is governed by a reproduction number R_v. Calibrated with field data (2012-2022), simulations reveal that uniform vaccination of both patches reduces R_v by 46% (from 2.84 to 1.52) but does not achieve elimination, while targeted strategies are less effective. These results demonstrate that exhaustive vaccination coverage across the entire urban network and increased vaccination intensity are necessary to eliminate canine rabies in N'Djamena. Our model provides a quantitative framework for planning effective control strategies.
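
The reported figures can be checked with a few lines of arithmetic. The snippet below only uses the two reproduction-number values quoted in the abstract; the helper name is hypothetical, and the threshold of 1 is the standard elimination criterion for a reproduction number.

```python
def relative_reduction(before, after):
    """Fractional decrease from `before` to `after`."""
    return (before - after) / before


r_before = 2.84  # reproduction number without the uniform strategy
r_after = 1.52   # reproduction number under uniform vaccination

reduction = relative_reduction(r_before, r_after)
print(f"{reduction:.1%}")  # 46.5%, consistent with the quoted 46%

# Elimination requires the reproduction number to fall below 1.
eliminated = r_after < 1.0
print(eliminated)  # False
```

The remaining gap is substantial: reaching the threshold from 1.52 would require a further relative reduction of roughly one third.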

18

Geometric Kinematics of Human Eyes

Turski, J.

2026-05-10 neuroscience 10.64898/2026.04.10.716809 medRxiv

Top 2%

0.0%

In previous studies by the author on binocular vision with the asymmetric eye (AE), which models a healthy human eye with misaligned optical components, the results were primarily presented in the Rodrigues vector (RV) framework and supported by simulations and 3D visualizations in GeoGebra's dynamic geometry environment. In this paper, the novel geometric kinematics of the human eye, that is, the eye with misaligned optics, is developed within the framework of rigid-body rotations under simplified assumptions about the eye rotations (the eye's translational movements are disregarded). The originality of the analysis lies in a precise geometric decomposition of a full rotation of the eye's posture into a torsion-free rotation (the geodesic part) and a torsional rotation (the non-geodesic extension of the geodesic part). This decomposition is extended to the corresponding decomposition of the angular velocity. A novel derivation of the eye's angular velocity from the RV formulation of the eye kinematics is proposed.
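
For readers unfamiliar with the Rodrigues vector framework, the sketch below shows the standard textbook representation, in which a rotation by angle theta about a unit axis n is encoded as tan(theta/2) n, together with the usual composition rule. It is generic rigid-body background rather than the author's decomposition; the function names are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def rodrigues_vector(axis, angle):
    """Rodrigues vector tan(angle/2) * unit(axis); undefined at 180 degrees."""
    norm = math.sqrt(dot(axis, axis))
    t = math.tan(angle / 2.0)
    return tuple(t * a / norm for a in axis)


def compose(ra, rb):
    """Rodrigues vector of rotation rb followed by rotation ra.

    Formula: (ra + rb + ra x rb) / (1 - ra . rb).
    The denominator vanishes when the combined rotation is 180 degrees.
    """
    denom = 1.0 - dot(ra, rb)
    c = cross(ra, rb)
    return tuple((x + y + z) / denom for x, y, z in zip(ra, rb, c))


def rotation_angle(r):
    """Recover the rotation angle (radians) from a Rodrigues vector."""
    return 2.0 * math.atan(math.sqrt(dot(r, r)))


# Two rotations about the same axis add their angles: 30 + 60 = 90 degrees.
rz30 = rodrigues_vector((0, 0, 1), math.radians(30))
rz60 = rodrigues_vector((0, 0, 1), math.radians(60))
combined = compose(rz30, rz60)

# Rotations about different axes do not commute: the cross term flips sign.
rx = rodrigues_vector((1, 0, 0), math.radians(20))
ry = rodrigues_vector((0, 1, 0), math.radians(20))
xy, yx = compose(rx, ry), compose(ry, rx)
```

The sign flip in the third component of `xy` versus `yx` is the kind of order-dependent twist that makes torsion a central concern in eye-rotation kinematics.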

19

The covariance matrix of metapopulation disease models and applications to early warning signals

Looker, J.; Rock, K. S.; Dyson, L.

2026-05-12 epidemiology 10.64898/2026.05.08.26352721 medRxiv

Top 2%

0.0%

Infectious disease time series often show signs of epidemic transitions, such as peaks and troughs. In these time series, changes in key system parameters can lead to catastrophic changes in the behaviour of the dynamical system (often called critical transitions). Modellers have increasingly shown that early warning signals can anticipate these transitions, both critical and non-critical, in infectious disease time series. Existing methods, however, generally focus on univariate time series data, or ignore spatiotemporal patterns that may be present as a disease spreads through a population. Recent developments in the ecological literature extend existing temporal and spatial methods to consider the covariance matrix of multiple, related time series. However, many of these proposed signals still assume a stationary time series or a system at equilibrium. Whilst this is often true in ecological modelling, disease systems are seldom at equilibrium. In this paper, we propose using the eigendecomposition of the non-stationary covariance matrix as a more suitable early warning signal for epidemiological data. We first analyse the expected trends in the eigenvalues and eigenbasis of the covariance matrix on approach to a transition. Next we apply these methods to a spatially-structured susceptible-infectious-recovered model to explore how the eigenbasis may provide extra information to modellers. Finally, we test these methods on SARS-CoV-2 case data during the 2020-2021 pandemic period in England.
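
A minimal sketch of a covariance-based indicator follows, assuming a plain rolling-window sample covariance and power iteration for the dominant eigenpair. It does not implement the non-stationary covariance matrix derived in the paper, and all function names are hypothetical.

```python
def covariance_matrix(series):
    """Sample covariance matrix of several equal-length time series
    (one list per patch or region)."""
    n = len(series[0])
    k = len(series)
    means = [sum(s) / n for s in series]
    return [[sum((series[i][t] - means[i]) * (series[j][t] - means[j])
                 for t in range(n)) / (n - 1)
             for j in range(k)]
            for i in range(k)]


def dominant_eigenpair(matrix, iterations=200):
    """Power iteration for the largest eigenvalue and its eigenvector.

    Starts from a non-uniform vector; convergence can still fail in the
    rare case that the start is orthogonal to the dominant eigenvector.
    """
    k = len(matrix)
    v = [float(i + 1) for i in range(k)]
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            return 0.0, v
        v = [x / norm for x in w]
    mv = [sum(matrix[i][j] * v[j] for j in range(k)) for i in range(k)]
    return sum(a * b for a, b in zip(v, mv)), v


def rolling_dominant_eigenvalue(series, window):
    """Dominant covariance eigenvalue in each sliding window."""
    n = len(series[0])
    values = []
    for end in range(window, n + 1):
        chunk = [s[end - window:end] for s in series]
        eigenvalue, _ = dominant_eigenpair(covariance_matrix(chunk))
        values.append(eigenvalue)
    return values


# Two perfectly correlated toy series: all variance lies on one axis.
cases = [[1, 2, 3, 4], [2, 4, 6, 8]]
signal = rolling_dominant_eigenvalue(cases, window=4)
```

In practice the eigenvector returned alongside the eigenvalue indicates which regions contribute most to the growing variance, which is the extra spatial information the abstract alludes to.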

20

Winter forecasting of respiratory viruses in Victoria, Australia

Henderson, A. S.; Moss, R.; Adekunle, A. I.; Ye, H.; O'Hara-Wild, M.; Eales, O.; Senior, K. L.; Tobin, R.; Windecker, S. M.; Golding, N.; Robinson, E.; Strachan, J.; Hyndman, R. J.; Dawson, P.; McCaw, J.; McBryde, E.; Shearer, F. M.

2026-05-21 epidemiology 10.64898/2026.05.18.26353544 medRxiv

Top 2%

0.0%

Temperate regions of the world, such as southern Australia, often experience increased health burden from respiratory pathogens during winter. The ability to forecast short-term trends in cases of these pathogens is of significant interest to public health. Across the 2024 southern hemisphere winter period, the Australia-Aotearoa Consortium for Epidemic Forecasting and Analytics (ACEFA) ran a pilot respiratory virus forecasting initiative in collaboration with the Victorian Department of Health. Each week from 9 May 2024 to 12 September 2024, the consortium solicited 28-day forecasts of daily case incidence for influenza, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), and respiratory syncytial virus (RSV) from multiple research groups. Four component model forecasts were contributed by three different research groups, with a fourth group using the component forecasts to generate ensemble forecasts (making a total of six models: four component models and two ensembles). Here we statistically evaluated the performance of each forecast and a baseline model against the observed case data. The two ensemble models were frequently the top-performing models. All models performed worse than the baseline model around the epidemic peaks for each pathogen.
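
The abstract does not say how the ensembles were built or which scoring rule was used. As a generic illustration only, the sketch below combines point forecasts from several component models day by day (median or mean) and compares the result with a baseline through a ratio of mean absolute errors. All names and numbers are made up for the example.

```python
from statistics import mean, median


def ensemble(component_forecasts, combine=median):
    """Combine component point forecasts day by day.

    component_forecasts: list of equal-length lists, one per model.
    combine: function applied across models for each day.
    """
    return [combine(day) for day in zip(*component_forecasts)]


def mean_absolute_error(forecast, observed):
    return mean(abs(f - o) for f, o in zip(forecast, observed))


def relative_skill(forecast, baseline, observed):
    """Ratio of forecast error to baseline error; below 1 means the
    forecast beats the baseline. Undefined if the baseline is perfect."""
    return (mean_absolute_error(forecast, observed)
            / mean_absolute_error(baseline, observed))


# Illustrative daily case forecasts from three hypothetical models.
components = [[10, 12], [14, 16], [12, 20]]
median_ensemble = ensemble(components, combine=median)
mean_ensemble = ensemble(components, combine=mean)

observed = [12, 16]
baseline = [11, 11]  # e.g. a flat "last observed value" forecast
skill = relative_skill(median_ensemble, baseline, observed)
```

Evaluations of probabilistic forecasts typically use proper scoring rules on full predictive distributions rather than point errors; the point-forecast version here is kept deliberately simple.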